An MDCT-Based Psychoacoustic Model Co-Processor Design for MPEG-2/4 AAC Audio Encoder
The Psychoacoustic Model (PAM) plays a very important role in MPEG-2/4 Advanced Audio Coding (AAC) encoding. It determines the sound quality of a given encoder and strongly influences its computational complexity. This paper presents a new architecture design for an MDCT-based PAM co-processor. The work is based on a dedicated hardware design, in contrast to traditional programmable approaches. Moreover, to reduce complexity, we replace the spreading-function calculations with a reduced set of fixed coefficients and decrease the number of transform kernels from three to one.
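The abstract does not give the transform definition, so the following is only a minimal sketch of the standard textbook MDCT, X[k] = sum over n of x[n] cos((pi/N)(n + 1/2 + N/2)(k + 1/2)), computed directly in O(N^2). The function name `mdct` is an assumption made for illustration, no analysis window is applied, and the code is not the co-processor architecture described in the paper.

```python
import math


def mdct(frame):
    """Direct O(N^2) MDCT: maps a frame of 2N samples to N coefficients.

    Illustrative only: no window is applied, and a real encoder would use
    a fast algorithm rather than this double loop.
    """
    if len(frame) == 0 or len(frame) % 2 != 0:
        raise ValueError("frame length must be a positive even number (2N)")
    n = len(frame) // 2
    coeffs = []
    for k in range(n):
        acc = 0.0
        for i, x in enumerate(frame):
            # Standard MDCT kernel with the N/2 phase offset.
            acc += x * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
        coeffs.append(acc)
    return coeffs
```

A frame of 2N samples always yields N coefficients, which is why consecutive frames are overlapped by 50% in MDCT-based codecs.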
Unsupervised Audio Key and Chord Recognition
This paper presents a new methodology for determining the chords of a music piece without using training data. Specifically, we introduce: 1) a wavelet-based audio denoising component to enhance a chroma-based feature extraction framework, 2) an unsupervised key recognition component to extract a bag of local keys, and 3) a chord recognizer that uses the estimated local keys to adjust the chromagram, based on a set of well-known tonal profiles, and recognizes chords on a frame-by-frame basis. We aim to recognize five chord classes (major, minor, diminished, augmented, suspended) plus one N class (no chord or silence). We demonstrate the performance of the proposed approach on 175 Beatles songs, for which we achieve a 75% F-measure in estimating a bag of local keys and at least 68.2% accuracy on chords, without discarding any audio segments or using other musical elements. The experimental results also show that the wavelet-based denoiser improves the chord recognition rate by approximately 4% over that of other chroma features.
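To make the frame-by-frame step concrete, here is a minimal sketch of chord recognition by binary template matching on a single 12-bin chroma vector. It is a common baseline, not the key-adjusted recognizer of the paper: the templates, the cosine-similarity score, the `min_energy` threshold, the label format, and the choice of a suspended-fourth template for the "suspended" class are all assumptions made for illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Semitone offsets above the root for the five chord classes.
# "sus4" is an assumed reading of the "suspended" class.
CHORD_INTERVALS = {
    "maj": (0, 4, 7),
    "min": (0, 3, 7),
    "dim": (0, 3, 6),
    "aug": (0, 4, 8),
    "sus4": (0, 5, 7),
}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def recognize_chord(chroma, min_energy=1e-6):
    """Return the best-matching label such as 'C:maj', or 'N' for silence."""
    if len(chroma) != 12:
        raise ValueError("chroma must have 12 bins")
    if sum(chroma) < min_energy:
        return "N"
    best_label, best_score = "N", -1.0
    for root in range(12):
        for quality, intervals in CHORD_INTERVALS.items():
            template = [0.0] * 12
            for interval in intervals:
                template[(root + interval) % 12] = 1.0
            score = cosine(chroma, template)
            # Strict comparison: ties keep the first candidate found.
            if score > best_score:
                best_label, best_score = NOTE_NAMES[root] + ":" + quality, score
    return best_label
```

Note that augmented triads are symmetric (C, E, and G# augmented share the same pitch classes), so this sketch reports whichever root it tries first. Context such as the local key is needed to resolve ambiguities of this kind.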
Separation of musical notes with highly overlapping partials using phase and temporal constrained complex matrix factorization
In note separation of polyphonic music, separating overlapping partials is an important and difficult problem. Fifths and octaves, the most challenging intervals, nevertheless occur frequently. Non-negative matrix factorization (NMF) employs constraints on energy and harmonic ratio to tackle this problem. Recently, complex matrix factorization (CMF) was proposed, which incorporates phase information into the source separation problem. However, temporal magnitude modulation remains severe for fifths and octaves when CMF is applied. In this work, we investigate a temporal smoothness model based on the CMF approach. The temporal activation coefficient of a preceding note is constrained when succeeding notes appear. Compared to unconstrained CMF, the magnitude modulation is greatly reduced in our computer simulations. Performance indices including source-to-interference ratio (SIR), source-to-artifacts ratio (SAR), source-to-distortion ratio (SDR), and modulation error ratio (MER) are given.
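As background for the factorization idea, the following is a minimal sketch of plain NMF with the well-known multiplicative updates for the Euclidean cost, factoring a magnitude spectrogram V (bins x frames) into spectral templates W and temporal activations H. It is not the phase-aware, temporally constrained CMF proposed in the paper. The function names, the seeded random initialization, and the small `eps` guard are assumptions made for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-12):
    """Plain NMF via multiplicative updates; returns (W, H)."""
    rng = random.Random(seed)
    n_bins, n_frames = len(v), len(v[0])
    # Strictly positive initialization so the updates stay non-negative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n_bins)]
    h = [[rng.random() + 0.1 for _ in range(n_frames)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[r][t] * num[r][t] / (den[r][t] + eps)
              for t in range(n_frames)] for r in range(rank)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[f][r] * num[f][r] / (den[f][r] + eps)
              for r in range(rank)] for f in range(n_bins)]
    return w, h
```

In this picture each row of H is the activation of one note over time. The constraint described in the abstract acts on such activation coefficients when a succeeding note enters, which this unconstrained sketch does not attempt.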
Analysis and Synthesis of the Violin Playing Style of Heifetz and Oistrakh
The same music composition can be performed in different ways, and differences in performance can strongly change the expression and character of the music. Experienced musicians tend to have their own performance style, which reflects their personality, attitudes, and beliefs. In this paper, we present a data-driven analysis of the performance styles of two master violinists, Jascha Heifetz and David Fyodorovich Oistrakh, to identify their differences. Specifically, from 26 gramophone recordings of each of the two violinists, we compute features characterizing performance aspects including articulation, energy, and vibrato, and then compare their styles in terms of the accents and legato groups of the music. Based on our findings, we propose algorithms to synthesize solo violin recordings in the style of these two masters from scores, for compositions that we either have or have not observed in the analysis stage. To the best of our knowledge, this study represents the first attempt to computationally analyze and synthesize the playing style of master violinists.